Kannada Part-Of-Speech Tagging with Probabilistic Classifiers
Similar resources
Selective Classifiers for Part-of-Speech Tagging
We investigate the use of selective classifiers for part-of-speech (POS) tagging. The idea is to allow classifiers to abstain on hard instances, passing them to downstream classifiers that may have more context available. In this report we focus on just the first stage of such a cascade, and ask whether selective classifiers attain the accuracies needed on those instances they accept, given tha...
A Maximum Entropy Approach to Kannada Part Of Speech Tagging
Part-of-speech (POS) tagging is the most important pre-processing step in almost all Natural Language Processing (NLP) applications. It is defined as the process of labeling each word in a text with its appropriate part of speech. In this paper, the Maximum Entropy model, a probabilistic classifier, is applied to the tagging of Kannada sentences. Kannada language is agglutinat...
Fast High-Accuracy Part-of-Speech Tagging by Independent Classifiers
Part-of-speech (POS) taggers can be quite accurate, but for practical use, accuracy often has to be sacrificed for speed. For example, the maintainers of the Stanford tagger (Toutanova et al., 2003; Manning, 2011) recommend tagging with a model whose per-tag error rate is 17% higher, relatively, than their most accurate model, to gain a factor of 10 or more in speed. In this paper, we treat POS...
Probabilistic Part-of-Speech Tagging Using Decision Trees
In this paper, a new probabilistic tagging method is presented which avoids problems that Markov-model-based taggers face when they have to estimate transition probabilities from sparse data. In this tagging method, transition probabilities are estimated using a decision tree. Based on this method, a part-of-speech tagger (called TreeTagger) has been implemented which achieves 96.36% accuracy...
Probabilistic Part Of Speech Tagging for Bahasa Indonesia
In this paper we report our work in developing part-of-speech tagging for Bahasa Indonesia using probabilistic approaches. We use Conditional Random Fields (CRF) and Maximum Entropy methods to assign a tag to each word. We use two tagsets containing 37 and 25 part-of-speech tags for Bahasa Indonesia. In this work we compared both methods using two different corpora. The results of the ex...
Journal
Journal title: International Journal of Computer Applications
Year: 2012
ISSN: 0975-8887
DOI: 10.5120/7442-0452